Word statistics in Blogs and RSS feeds: Towards empirical universal evidence
Abstract
We focus on the statistics of word occurrences and of the waiting times between such occurrences in blogs. Because word frequencies are heterogeneous, the empirical analysis is performed by studying classes of "frequently-equivalent" words, i.e. by grouping words according to their frequencies. Two limiting cases are considered: the dilute limit, i.e. words that are used less than once a day, and the dense limit, for frequent words. In both cases, extreme events occur more frequently than expected from the Poisson hypothesis. These deviations from Poisson statistics reveal non-trivial time correlations between events that are associated with bursts of activity. The distribution of waiting times is shown to behave like a stretched exponential and to have the same shape for different sets of words sharing a common frequency, thereby revealing universal features.
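The comparison described in the abstract, waiting times between occurrences checked against the Poisson expectation and against a stretched exponential, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic (Weibull-distributed gaps, not the blog corpus), the stretched exponential is written in its survival-function form exp(-(t/tau)^beta), and the function names, `tau`, and `beta` are hypothetical and do not come from the paper.

```python
import math
import random


def waiting_times(timestamps):
    """Gaps between consecutive occurrences of a word (any time unit)."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def empirical_survival(waits, t):
    """Fraction of observed waiting times strictly longer than t."""
    return sum(1 for w in waits if w > t) / len(waits)


def poisson_survival(t, mean_wait):
    """P(wait > t) for a Poisson process with the given mean waiting time."""
    return math.exp(-t / mean_wait)


def stretched_exponential_survival(t, tau, beta):
    """exp(-(t / tau) ** beta); beta = 1 recovers the Poisson case."""
    return math.exp(-((t / tau) ** beta))


# Synthetic occurrences (assumption): gaps drawn from a Weibull law, whose
# survival function is exactly a stretched exponential with these parameters.
rng = random.Random(0)
tau, beta = 1.0, 0.5
timestamps = []
clock = 0.0
for _ in range(20000):
    clock += rng.weibullvariate(tau, beta)
    timestamps.append(clock)

waits = waiting_times(timestamps)
mean_wait = sum(waits) / len(waits)

# Look at a long waiting time: five times the mean gap.
threshold = 5 * mean_wait
observed = empirical_survival(waits, threshold)
expected_poisson = poisson_survival(threshold, mean_wait)
expected_stretched = stretched_exponential_survival(threshold, tau, beta)

print(f"observed P(wait > t):  {observed:.4f}")
print(f"Poisson expectation:   {expected_poisson:.4f}")
print(f"stretched exponential: {expected_stretched:.4f}")
```

On such bursty synthetic data, long gaps are far more common than a Poisson process with the same mean would predict, while the stretched-exponential form tracks the observed tail. Applying the same check to real data would require word-occurrence timestamps grouped by word frequency, as the abstract describes.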
Similar resources
Cobra: Content-based Filtering and Aggregation of Blogs and RSS Feeds
Blogs and RSS feeds are becoming increasingly popular. The blogging site LiveJournal has over 11 million user accounts, and according to one report, over 1.6 million postings are made to blogs every day. The “Blogosphere” is a new hotbed of Internet-based media that represents a shift from mostly static content to dynamic, continuously-updated discussions. The problem is that finding and tracki...
Blogs Search Engine Using RSS Syndication and Fuzzy Parameters
The rapid development of the internet eventually increases the number of internet users triggering the need for an intelligent search engine that is able to minimize the search on world wide web (WWW) and find relevant information as requested. To overcome the issue of finding relevant information as well as minimizing the search on WWW, this paper proposes a search engine that is specifically ...
Revealing Student Blogging Activities Using RSS Feeds and LMS Logs
Blogs are an easy-to-use, free alternative to classic means of computer-mediated communication. Moreover, they are authentically aligned with web activity patterns of today’s students. The body of studies on integrating and implementing blogs in various educational settings has grown rapidly recently; however, it is often difficult to distill practical advice from these studies since the applic...
RSS Feed Recommendation
Really Simple Syndication (RSS) feeds allow users to access blogs and articles in an easy-to-read format. They cut out the overhead of navigating websites for content and allow users to get information more quickly. Currently, the user is in total control of their RSS feeds, adding and deleting feeds according to their tastes. This requires the user to actively search out RSS feed...
Foafing the Music: Bridging the Semantic Gap in Music Recommendation
In this paper we give an overview of the Foafing the Music system. The system uses the Friend of a Friend (FOAF) and RDF Site Summary (RSS) vocabularies for recommending music to a user, depending on the user’s musical tastes and listening habits. Music information (new album releases, podcast sessions, audio from MP3 blogs, related artists’ news and upcoming gigs) is gathered from thousands of...
Journal:
- J. Informetrics
Volume: 1
Publication date: 2007